{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Assignment-01-附加题-极具挑战性的题目，基础一般的同学请把前边的做好就非常非常好了，这个是一个趣味题 😝"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 基于模式匹配的对话机器人实现"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Pattern Match\n",
    "\n",
    "机器能否实现对话，这个长久以来是衡量机器人是否具有智能的一个重要标志。 Alan Turing早在其文中就提出过一个测试机器智能程度的方法，该方法主要是考察人类是否能够通过对话内容区分对方是机器人还是真正的人类，如果人类无法区分，我们就称之为具有”智能“。而这个测试，后来被大家叫做”图灵测试“，之后也被翻拍成了一步著名电影，叫做《模拟游戏》。 \n",
    "\n",
    "\n",
    "\n",
    "既然图灵当年以此作为机器是否具备智能的标志，这项任务肯定是复杂的。自从 1960s 开始，诸多科学家就希望从各个方面来解决这个问题，直到如今，都只能解决一部分问题。 目前对话机器人的建立方法有很多，今天的作业中，我们为大家提供一种快速的基于模板的对话机器人配置方式。\n",
    "\n",
    "此次作业首先希望大家能够读懂这段程序的代码，其次，在此基于我们提供的代码，**能够把它改造成汉语版本，实现对话效果。**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```\n",
    "Pattern: (我想要A)\n",
    "Response: (如果你有 A，对你意味着什么呢？)\n",
    "\n",
    "Input: (我想要度假)\n",
    "Response: (如果你有度假，对你意味着什么呢？)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了实现模板的判断和定义，我们需要定义一个特殊的符号类型，这个符号类型就叫做\"variable\"， 这个\"variable\"用来表示是一个占位符。例如，定义一个目标: \"I want X\"， 我们可以表示成  \"I want ?X\", 意思就是?X是一个用来占位的符号。\n",
    "\n",
    "如果输入了\"I want holiday\"， 在这里 'holiday' 就是 '?X'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 判断是否为占位符 ?X\n",
    "def is_variable(pat:str) -> bool:\n",
    "    return pat.startswith('?') and all(s.isalpha() for s in pat[1:])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 判断saying是否与模板 pattern 相匹配\n",
    "def pat_match(pattern: list, saying: list) -> bool:\n",
    "    if is_variable(pattern[0]): return True\n",
    "    else:\n",
    "        if pattern[0] != saying[0]: return False\n",
    "        else:\n",
    "            return pat_match(pattern[1:], saying[1:])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 例如"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_match('I want ?X'.split(), \"I want holiday\".split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_match('I have dreamed a ?X'.split(), \"I dreamed about dog\".split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_match('I dreamed about ?X'.split(), \"I dreamed about dog\".split())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 获得匹配的变量"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "以上的函数能够判断两个 pattern 是不是相符，但是我们更加希望的是获得每个variable对应的是什么值。\n",
    "\n",
    "我们对程序做如下修改:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "def pat_match(pattern: list, saying: list) -> (str, str):\n",
    "    if is_variable(pattern[0]):\n",
    "        return pattern[0], saying[0]\n",
    "    else:\n",
    "        if pattern[0] != saying[0]: return False\n",
    "        else:\n",
    "            return pat_match(pattern[1:], saying[1:])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "pattern = 'I want ?X'.split()\n",
    "saying = \"I want holiday\".split()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('?X', 'holiday')"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_match(pattern, saying)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('?X', '2+2')"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_match(\"?X equals ?X\".split(), \"2+2 equals 2+2\".split())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "但是，如果我们的 Pattern 中具备两个变量，那么以上程序就不能解决了，我们可以对程序做如下修改: \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "def pat_match(pattern: list, saying: list) -> list:\n",
    "    if not pattern or not saying: return []\n",
    "    \n",
    "    if is_variable(pattern[0]):\n",
    "        return [(pattern[0], saying[0])] + pat_match(pattern[1:], saying[1:])\n",
    "    else:\n",
    "        if pattern[0] != saying[0]: return []\n",
    "        else:\n",
    "            return pat_match(pattern[1:], saying[1:])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "于是，我们可以获得： "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('?X', '3'), ('?Y', '2')]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_match(\"?X greater than ?Y\".split(), \"3 greater than 2\".split())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果我们知道了每个变量对应的是什么，那么我们就可以很方便的使用我们定义好的模板进行替换："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了方便接下来的替换工作，我们新建立两个函数，一个是把我们解析出来的结果变成一个 dictionary，一个是依据这个 dictionary 依照我们的定义的方式进行替换。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 将解析出来的结果转变为 key为模板，value为值的字典\n",
    "def pat_to_dict(patterns):\n",
    "    return {k: v for k, v in patterns}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 将需要替换的的包含模板的句子还原为完整句子\n",
    "def subsitite(rule: list, parsed_rules: dict) -> list:\n",
    "    if not rule: return []\n",
    "    \n",
    "    return [parsed_rules.get(rule[0], rule[0])] + subsitite(rule[1:], parsed_rules)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "got_patterns = pat_match(\"I want ?X\".split(), \"I want iPhone\".split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('?X', 'iPhone')]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "got_patterns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['What', 'if', 'you', 'mean', 'if', 'you', 'got', 'a', 'iPhone']"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "subsitite(\"What if you mean if you got a ?X\".split(), pat_to_dict(got_patterns))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了将以上输出变成一句话，也很简单，我们使用 Python 的 join 方法即可： "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "john_pat = pat_match('?P needs ?X'.split(), \"John needs resting\".split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'What if you mean if you got a iPhone'"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "' '.join(subsitite(\"What if you mean if you got a ?X\".split(), pat_to_dict(got_patterns)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "john_pat = pat_match('?P needs ?X'.split(), \"John needs vacation\".split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Why', 'does', 'John', 'need', 'vacation', '?']"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "subsitite(\"Why does ?P need ?X ?\".split(), pat_to_dict(john_pat))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Why does John need vacation ?'"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "' '.join(subsitite(\"Why does ?P need ?X ?\".split(), pat_to_dict(john_pat)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "那么如果我们现在定义一些patterns，就可以实现基于模板的对话生成了:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "defined_patterns = {\n",
    "    \"I need ?X\": [\"Image you will get ?X soon\", \"Why do you need ?X ?\"], \n",
    "    \"My ?X told me something\": [\"Talk about more about your ?X\", \"How do you think about your ?X ?\"]\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "\n",
    "def get_response(saying: str) -> str:\n",
    "    \"\"\"\" please implement the code, to get the response as followings:\n",
    "    \n",
    "    >>> get_response('I need iPhone') \n",
    "    >>> Image you will get iPhone soon\n",
    "    >>> get_response(\"My mother told me something\")\n",
    "    >>> Talk about more about your monther.\n",
    "    \"\"\"\n",
    "    for key in defined_patterns:\n",
    "        pattern = pat_match(key.split(), saying.split())\n",
    "        if pattern:\n",
    "            return ' '.join(subsitite(random.choice(defined_patterns[key]).split(), pat_to_dict(pattern)))\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Why do you need drink ?'"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_response('I need drink')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'How do you think about your brother ?'"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_response('My brother told me something')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **评阅点**: 运行代码，是否能够实现基本的语言问答"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Segment Match"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们上边的这种形式，能够进行一些初级的对话了，但是我们的模式逐字逐句匹配的， \"I need iPhone\" 和 \"I need ?X\" 可以匹配，但是\"I need an iPhone\" 和 \"I need ?X\" 就不匹配了，那怎么办？ \n",
    "\n",
    "为了解决这个问题，我们可以新建一个变量类型 \"?\\*X\", 这种类型多了一个星号(\\*),表示匹配多个"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "首先，和前文类似，我们需要定义一个判断是不是匹配多个的variable"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 判断输入是否是匹配符 ?*X\n",
    "def is_pattern_segment(pattern: str) -> bool:\n",
    "    return pattern.startswith('?*') and all(a.isalpha() for a in pattern[2:])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "is_pattern_segment('?*P')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import defaultdict"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "然后我们把之前的 ```pat_match```程序改写成如下， 主要是增加了 ``` is_pattern_segment ```的部分. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 161,
   "metadata": {},
   "outputs": [],
   "source": [
    "fail = [True, None]\n",
    "\n",
    "def pat_match_with_seg(pattern: list, saying: list) -> list:\n",
    "    if not pattern or not saying: return []\n",
    "    \n",
    "    pat = pattern[0]\n",
    "    \n",
    "    if is_variable(pat):\n",
    "        return [(pat, saying[0])] + pat_match_with_seg(pattern[1:], saying[1:])\n",
    "    elif is_pattern_segment(pat):\n",
    "        match, index = segment_match(pattern, saying)\n",
    "        if match[0] is not None:\n",
    "            return [match] + pat_match_with_seg(pattern[1:], saying[index:])\n",
    "        else:\n",
    "            return fail\n",
    "    elif pat == saying[0]:\n",
    "        return pat_match_with_seg(pattern[1:], saying[1:])\n",
    "    else:\n",
    "        return fail"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这段程序里比较重要的一个新函数是 ```segment_match```，这个函数输入是一个以 ```segment_pattern```开头的模式，尽最大可能进行，匹配到这个*边长*的变量对应的部分。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 162,
   "metadata": {},
   "outputs": [],
   "source": [
    "def segment_match(pattern: list, saying: list) -> (tuple, int):\n",
    "    # 先取出 匹配符 和 剩余部分\n",
    "    seg_pat, rest = pattern[0], pattern[1:]\n",
    "    seg_pat = seg_pat.replace('?*', '?')\n",
    "\n",
    "    if not rest: return (seg_pat, saying), len(saying)    \n",
    "    \n",
    "    for i, token in enumerate(saying):\n",
    "        # 使用剩余部分的第一个元素与输入的所有字符进行对比\n",
    "        # 找到相等的时候就返回其 索引与（匹配模板，根据索引取到的所有元素）\n",
    "        if rest[0] == token and is_match(rest[1:], saying[(i + 1):]):\n",
    "            return (seg_pat, saying[:i]), i\n",
    "    # 这里会有一个Bug，直接返回的话，就会把 pattern 直接匹配为整个句子\n",
    "    # return (seg_path, saying), len(saying)\n",
    "    return (None, None), 0\n",
    "\n",
    "# 判断模板与待匹配的剩余部分是否 相等\n",
    "def is_match(rest: list, saying: list) -> bool:\n",
    "    if not rest and not saying:\n",
    "        return True\n",
    "    if not all(a.isalpha() for a in rest[0]):\n",
    "        return True\n",
    "    if rest[0] != saying[0]:\n",
    "        return False\n",
    "    return is_match(rest[1:], saying[1:])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 163,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(('?P', ['My', 'dog', 'and', 'my', 'cat']), 5)"
      ]
     },
     "execution_count": 163,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "segment_match('?*P is very good'.split(), \"My dog and my cat is very good\".split())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "现在，我们就可以做到以下的匹配模式了: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 146,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('?P', ['My', 'dog']), ('?X', ['my', 'cat', 'is', 'very', 'cute'])]"
      ]
     },
     "execution_count": 146,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_match_with_seg('?*P is very good and ?*X'.split(), \"My dog is very good and my cat is very cute\".split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果我们继续定义一些模板，我们进行匹配，就能够进行更加复杂的问题了: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 147,
   "metadata": {},
   "outputs": [],
   "source": [
    "response_pair = {\n",
    "    'I need ?*X': [\n",
    "        \"Why do you neeed ?*X\"\n",
    "    ],\n",
    "    \"I dont like my ?*X\": [\"What bad things did ?*X do for you?\"]\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 148,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('?X', ['an', 'iPhone'])]"
      ]
     },
     "execution_count": 148,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_match_with_seg('I need ?*X'.split(), \n",
    "                  \"I need an iPhone\".split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 149,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Why', 'do', 'you', 'neeed', 'an iPhone']"
      ]
     },
     "execution_count": 149,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "subsitite(\"Why do you neeed ?X\".split(), pat_to_dict(pat_match_with_seg('I need ?*X'.split(), \n",
    "                  \"I need an iPhone\".split())))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " 我们会发现，pat_to_dict在这个场景下会有有一点小问题，没关系，修正一些: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 150,
   "metadata": {},
   "outputs": [],
   "source": [
    "def pat_to_dict(patterns):\n",
    "    return {k: ' '.join(v) if isinstance(v, list) else v for k, v in patterns}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 151,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Why', 'do', 'you', 'neeed', 'an iPhone']"
      ]
     },
     "execution_count": 151,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "subsitite(\"Why do you neeed ?X\".split(), pat_to_dict(pat_match_with_seg('I need ?*X'.split(), \n",
    "                  \"I need an iPhone\".split())))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果我们定义这样的一个模板:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 152,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('?*X hello ?*Y', 'Hi, how do you do')"
      ]
     },
     "execution_count": 152,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(\"?*X hello ?*Y\", \"Hi, how do you do\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 153,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Hi,', 'how', 'do', 'you', 'do?']"
      ]
     },
     "execution_count": 153,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "subsitite(\"Hi, how do you do?\".split(), pat_to_dict(pat_match_with_seg('?*X hello ?*Y'.split(), \n",
    "                  \"I am mike, hello \".split())))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 现在是你的时间了"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 154,
   "metadata": {},
   "outputs": [],
   "source": [
    "#我们给大家一些例子: \n",
    "    \n",
    "rules = {\n",
    "    \"?*X hello ?*Y\": [\"Hi, how do you do?\"],\n",
    "    \"I was ?*X\": [\"Were you really ?X ?\", \"I already knew you were ?X .\"]\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 问题1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "编写一个程序, ```get_response(saying, response_rules)```输入是一个字符串 + 我们定义的 rules，例如上边我们所写的 pattern， 输出是一个回答。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **评阅点**： get_response是否定义完整，功能是否齐备"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 166,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "\n",
    "def get_response(saying: str, response_rules: dict) -> str:\n",
    "    say_list = saying.split()\n",
    "    for key in response_rules:\n",
    "        pattern_list = key.split()\n",
    "        pattern = pat_match_with_seg(pattern_list, say_list)\n",
    "\n",
    "        if pattern[0] != True:\n",
    "            return  ' '.join(subsitite(random.choice(response_rules[key]).split(), pat_to_dict(pattern)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 167,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Hi, how do you do?'"
      ]
     },
     "execution_count": 167,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_response(\"hi, Bob, hello !\", response_rules=rules)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 168,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Were you really already got an apple ?'"
      ]
     },
     "execution_count": 168,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_response('I was already got an apple', rules)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 问题2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "改写以上程序，将程序变成能够支持中文输入的模式。\n",
    "*提示*: 你可能需要用到 jieba 分词"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **评阅点**： 是否能够支持中文问题，jieba分词的使用"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 197,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 检查输入句子是否为中文字符\n",
    "def is_contains_chinese(strs: str) -> bool:\n",
    "    for _char in strs:\n",
    "        if not '\\u4e00' <= _char <= '\\u9fa5':\n",
    "            return True\n",
    "    return False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 198,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 198,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "is_contains_chinese('?*x我想要?*y和?*z')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 234,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([0, 6, 10], [5])"
      ]
     },
     "execution_count": 234,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from jieba import cut\n",
    "\n",
    "def find_pat_index(strs: str) -> list:\n",
    "    index_list = []\n",
    "    pat_count = strs.count('?')\n",
    "    \n",
    "    if pat_count < 1:\n",
    "        pass\n",
    "    elif pat_count == 1:\n",
    "        index_list.append(strs.index('?'))\n",
    "    else:\n",
    "        pat_index = strs.index('?')\n",
    "        index_list.append(pat_index)\n",
    "        while pat_count > 1:\n",
    "            sub_strs = strs[pat_index+3:]\n",
    "            pat_index += 3 + sub_strs.index('?')\n",
    "            index_list.append(pat_index)\n",
    "            pat_count -= 1\n",
    "    return index_list\n",
    "\n",
    "def find_pat_index_from_res(strs: str) -> list:\n",
    "    index_list = []\n",
    "    pat_count = strs.count('?')\n",
    "    \n",
    "    if pat_count < 1:\n",
    "        pass\n",
    "    elif pat_count == 1:\n",
    "        index_list.append(strs.index('?'))\n",
    "    else:\n",
    "        pat_index = strs.index('?')\n",
    "        index_list.append(pat_index)\n",
    "        while pat_count > 1:\n",
    "            sub_strs = strs[pat_index+2:]\n",
    "            pat_index += 2 + sub_strs.index('?')\n",
    "            index_list.append(pat_index)\n",
    "            pat_count -= 1\n",
    "    return index_list\n",
    "            \n",
    "\n",
    "find_pat_index('?*x我想要?*y和?*z'), find_pat_index_from_res('为什么你想?y')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 240,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(['?*x', '我', '记得', '?*y'], ['你', '觉得', '?y', '有', '什么', '意义', '呢', '？'])"
      ]
     },
     "execution_count": 240,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def cut_pattern_from_chinese(strs: str, n:int) -> list:\n",
    "    strs_list = []\n",
    "    pat_index = find_pat_index(strs)\n",
    "    if not pat_index:\n",
    "        strs_list += list(cut(strs))\n",
    "        return strs_list\n",
    "    if len(pat_index) == 1:\n",
    "        if strs[:pat_index[0]]:\n",
    "            strs_list += list(cut(strs[:pat_index[0]]))\n",
    "        strs_list.append(strs[pat_index[0]: pat_index[0]+n])\n",
    "        strs_list += list(cut(strs[pat_index[0]+n:]))\n",
    "        return strs_list\n",
    "    elif len(pat_index) == 2:\n",
    "        if strs[:pat_index[0]]:\n",
    "            strs_list += list(cut(strs[:pat_index[0]]))\n",
    "        strs_list.append(strs[pat_index[0]: pat_index[0]+n])\n",
    "        strs_list += list(cut(strs[pat_index[0]+n:pat_index[1]]))\n",
    "        strs_list.append(strs[pat_index[1]: pat_index[1]+n])\n",
    "        sub_strs = strs[pat_index[1]+n:]\n",
    "        strs_list += list(cut(sub_strs))\n",
    "        return strs_list\n",
    "    else:\n",
    "        for i in range(len(pat_index)):\n",
    "            if i == 0:\n",
    "                if strs[:pat_index[i]]:\n",
    "                    strs_list += list(cut(strs[:pat_index[i]]))\n",
    "                strs_list.append(strs[pat_index[i]: pat_index[i]+n])\n",
    "            elif i == len(pat_index)-1:    \n",
    "                sub_strs = strs[pat_index[i]+n:]\n",
    "                strs_list.append(strs[pat_index[i]:pat_index[i]+n])\n",
    "                if sub_strs:\n",
    "                    strs_list += list(cut(sub_strs))\n",
    "            else:\n",
    "                sub_strs = strs[pat_index[i-1]+n:pat_index[i]]\n",
    "                strs_list.append(strs[pat_index[i]:pat_index[i]+n])\n",
    "                strs_list += list(cut(sub_strs))\n",
    "        return strs_list\n",
    "    \n",
    "    \n",
    "cut_pattern_from_chinese('?*x我记得?*y', 3), cut_pattern_from_chinese('你觉得?y有什么意义呢？', 2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 244,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "\n",
    "def get_response_v2(saying: str, response_rules: dict) -> str:\n",
    "    say_list = []\n",
    "    pattern_list = []\n",
    "    if is_contains_chinese(saying):\n",
    "        say_list = list(cut(saying))\n",
    "    else:\n",
    "        say_list = saying.split()\n",
    "    for key in response_rules:\n",
    "        if is_contains_chinese(key):\n",
    "            pattern_list = cut_pattern_from_chinese(key, 3)\n",
    "        else:\n",
    "            pattern_list = key.split()\n",
    "        pattern = pat_match_with_seg(pattern_list, say_list)\n",
    "\n",
    "        if pattern[0] != True:\n",
    "            if is_contains_chinese(key):\n",
    "                return ''.join(subsitite(cut_pattern_from_chinese(random.choice(response_rules[key]), 2), pat_to_dict(pattern)))\n",
    "            else:\n",
    "                return  ' '.join(subsitite(random.choice(response_rules[key]).split(), pat_to_dict(pattern)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 245,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'我听到你这么说， 也很难过'"
      ]
     },
     "execution_count": 245,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 借用下面的 rule_responses 测试一下 - 有些回答有点人工智障的感觉 - 后续继续努力改一下\n",
    "get_response_v2(\"听到这个结果我很难过，因为我准备了很久。\", rule_responses)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 226,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'?x': '您好 ，', '?y': '问 你 一件 事 。'}"
      ]
     },
     "execution_count": 226,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pat_to_dict([('?x', ['您好', '，']), ('?y', ['问', '你', '一件', '事', '。'])])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 问题3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "多设计一些模式，让这个程序变得更好玩，多和大家交流，看看大家有什么好玩的模式"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **评阅点**: 是否设计了更有趣的点？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 问题4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. 这样的程序有什么优点？有什么缺点？你有什么可以改进的方法吗？ \n",
    "2. 什么是数据驱动？数据驱动在这个程序里如何体现？\n",
    "3. 数据驱动与 AI 的关系是什么？ \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ans: {\n",
    "1. 这样的程序的优点是实现简单，只需要匹配对应的模板就能生成回答。 缺点就是生成的回答有时候文不对题，还会存在一些语法错误。改进的方法是对匹配到的模板中一些词语根据正常的文字语法进行替换。\n",
    "\n",
    "2. 数据驱动就是通过研究大量的数据，提炼出一些规则，并且用这些规则来对提问进行回答。数据驱动在程序中的提现就是pattern模板中的句子。\n",
    "\n",
    "3. 数据驱动和AI的关系是AI技术之中使用数据来完成算法的训练，而数据驱动方式又可以使用AI中的一些算法来增强效果，属于相辅相成的关系。\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **评阅点**: 是否提出了比较有意义的问题？对于数据驱动是否理解到位？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "一些参考 pattern"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 195,
   "metadata": {},
   "outputs": [],
   "source": [
    "rule_responses = {\n",
    "    '?*x hello ?*y': ['How do you do', 'Please state your problem'],\n",
    "    '?*x I want ?*y': ['what would it mean if you got ?y', 'Why do you want ?y', 'Suppose you got ?y soon'],\n",
    "    '?*x if ?*y': ['Do you really think its likely that ?y', 'Do you wish that ?y', 'What do you think about ?y', 'Really-- if ?y'],\n",
    "    '?*x no ?*y': ['why not?', 'You are being a negative', 'Are you saying \\'No\\' just to be negative?'],\n",
    "    '?*x I was ?*y': ['Were you really', 'Perhaps I already knew you were ?y', 'Why do you tell me you were ?y now?'],\n",
    "    '?*x I feel ?*y': ['Do you often feel ?y ?', 'What other feelings do you have?'],\n",
    "    '?*x你好?*y': ['你好呀', '请告诉我你的问题'],\n",
    "    '?*x我想?*y': ['你觉得?y有什么意义呢？', '为什么你想?y', '你可以想想你很快就可以?y了'],\n",
    "    '?*x我想要?*y': ['?x想问你，你觉得?y有什么意义呢?', '为什么你想?y', '?x觉得... 你可以想想你很快就可以有?y了', '你看?x像?y不', '我看你就像?y'],\n",
    "    '?*x喜欢?*y': ['喜欢?y的哪里？', '?y有什么好的呢？', '你想要?y吗？'],\n",
    "    '?*x讨厌?*y': ['?y怎么会那么讨厌呢?', '讨厌?y的哪里？', '?y有什么不好呢？', '你不想要?y吗？'],\n",
    "    '?*xAI?*y': ['你为什么要提AI的事情？', '你为什么觉得AI要解决你的问题？'],\n",
    "    '?*x机器人?*y': ['你为什么要提机器人的事情？', '你为什么觉得机器人要解决你的问题？'],\n",
    "    '?*x对不起?*y': ['不用道歉', '你为什么觉得你需要道歉呢?'],\n",
    "    '?*x我记得?*y': ['你经常会想起这个吗？', '除了?y你还会想起什么吗？', '你为什么和我提起?y'],\n",
    "    '?*x如果?*y': ['你真的觉得?y会发生吗？', '你希望?y吗?', '真的吗？如果?y的话', '关于?y你怎么想？'],\n",
    "    '?*x我?*z梦见?*y':['真的吗? --- ?y', '你在醒着的时候，以前想象过?y吗？', '你以前梦见过?y吗'],\n",
    "    '?*x妈妈?*y': ['你家里除了?y还有谁?', '嗯嗯，多说一点和你家里有关系的', '她对你影响很大吗？'],\n",
    "    '?*x爸爸?*y': ['你家里除了?y还有谁?', '嗯嗯，多说一点和你家里有关系的', '他对你影响很大吗？', '每当你想起你爸爸的时候， 你还会想起其他的吗?'],\n",
    "    '?*x我愿意?*y': ['我可以帮你?y吗？', '你可以解释一下，为什么想?y'],\n",
    "    '?*x我很难过，因为?*y': ['我听到你这么说， 也很难过', '?y不应该让你这么难过的'],\n",
    "    '?*x难过?*y': ['我听到你这么说， 也很难过',\n",
    "                 '不应该让你这么难过的，你觉得你拥有什么，就会不难过?',\n",
    "                 '你觉得事情变成什么样，你就不难过了?'],\n",
    "    '?*x就像?*y': ['你觉得?x和?y有什么相似性？', '?x和?y真的有关系吗？', '怎么说？'],\n",
    "    '?*x和?*y都?*z': ['你觉得?z有什么问题吗?', '?z会对你有什么影响呢?'],\n",
    "    '?*x和?*y一样?*z': ['你觉得?z有什么问题吗?', '?z会对你有什么影响呢?'],\n",
    "    '?*x我是?*y': ['真的吗？', '?x想告诉你，或许我早就知道你是?y', '你为什么现在才告诉我你是?y'],\n",
    "    '?*x我是?*y吗': ['如果你是?y会怎么样呢？', '你觉得你是?y吗', '如果你是?y，那一位着什么?'],\n",
    "    '?*x你是?*y吗':  ['你为什么会对我是不是?y感兴趣?', '那你希望我是?y吗', '你要是喜欢， 我就会是?y'],\n",
    "    '?*x你是?*y' : ['为什么你觉得我是?y'],\n",
    "    '?*x因为?*y' : ['?y是真正的原因吗？', '你觉得会有其他原因吗?'],\n",
    "    '?*x我不能?*y': ['你或许现在就能?*y', '如果你能?*y,会怎样呢？'],\n",
    "    '?*x我觉得?*y': ['你经常这样感觉吗？', '除了到这个，你还有什么其他的感觉吗？'],\n",
    "    '?*x我?*y你?*z': ['其实很有可能我们互相?y'],\n",
    "    '?*x你为什么不?*y': ['你自己为什么不?y', '你觉得我不会?y', '等我心情好了，我就?y'],\n",
    "    '?*x好的?*y': ['好的', '你是一个很正能量的人'],\n",
    "    '?*x嗯嗯?*y': ['好的', '你是一个很正能量的人'],\n",
    "    '?*x不嘛?*y': ['为什么不？', '你有一点负能量', '你说 不，是想表达不想的意思吗？'],\n",
    "    '?*x不要?*y': ['为什么不？', '你有一点负能量', '你说 不，是想表达不想的意思吗？'],\n",
    "    '?*x有些人?*y': ['具体是哪些人呢?'],\n",
    "    '?*x有的人?*y': ['具体是哪些人呢?'],\n",
    "    '?*x某些人?*y': ['具体是哪些人呢?'],\n",
    "    '?*x每个人?*y': ['我确定不是人人都是', '你能想到一点特殊情况吗？', '例如谁？', '你看到的其实只是一小部分人'],\n",
    "    '?*x所有人?*y': ['我确定不是人人都是', '你能想到一点特殊情况吗？', '例如谁？', '你看到的其实只是一小部分人'],\n",
    "    '?*x总是?*y': ['你能想到一些其他情况吗?', '例如什么时候?', '你具体是说哪一次？', '真的---总是吗？'],\n",
    "    '?*x一直?*y': ['你能想到一些其他情况吗?', '例如什么时候?', '你具体是说哪一次？', '真的---总是吗？'],\n",
    "    '?*x或许?*y': ['你看起来不太确定'],\n",
    "    '?*x可能?*y': ['你看起来不太确定'],\n",
    "    '?*x他们是?*y吗？': ['你觉得他们可能不是?y？'],\n",
    "    '?*x': ['很有趣', '请继续', '我不太确定我很理解你说的, 能稍微详细解释一下吗?']\n",
    "}\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
